For background, see the overview of the Central Event System.
This walkthrough is intended to be read side by side with the "Plug-in_SiteMap" demo projects, which are available under the install directory. The code produces a simple map (a list of pages and the pages they link to) of any web-site as it is imported (crawled).
Create a new class library project (to produce a DLL); it can be given any name that Visual Studio accepts.
Alternatively, a blank plug-in project can be created from the Index Management Tool. Select the Plug-in form, enter the folder in which to create the project, choose the required language and click Create Project.
Add references to Keyoti.SearchEngine.Core.DLL, Keyoti.SearchEngine.License.DLL, Keyoti.Text.MSOffice.DLL and Keyoti.Text.LemmaGenerator.DLL. Note that you must use the Keyoti2... versions if you use those assemblies in your toolbox; see the notes section 'References To Search DLLs'.
Create a new class named 'ExternalEventHandler' in a namespace called 'Keyoti.SearchEngine'. Note that for VB.NET projects you will need to edit the project properties and delete the 'Root namespace' setting for the code to work as presented.
// C#
using System;
using System.Collections;
using System.Text;
using System.IO;
using Keyoti.SearchEngine.Events;
using Keyoti.SearchEngine.Documents;
using Keyoti.SearchEngine.DataAccess;
using Keyoti.SearchEngine;

namespace Keyoti.SearchEngine
{
    /// <summary>
    /// Creates a site-map when a web-site is crawled.
    /// </summary>
    public class ExternalEventHandler
    {
        IEventDispatcher dispatcher;
        Configuration conf;

        public ExternalEventHandler(IEventDispatcher dispatcher, Configuration conf)
        {
        }

        public void DetachHandlers()
        {
        }
    }
}
' VB.NET
Imports Keyoti.SearchEngine.Events
Imports Keyoti.SearchEngine.Documents
Imports Keyoti.SearchEngine.DataAccess
Imports Keyoti.SearchEngine
Imports System.Collections
Imports System.IO

Namespace Keyoti.SearchEngine
    ''' <summary>
    ''' Creates a site-map when a web-site is crawled.
    ''' </summary>
    Public Class ExternalEventHandler
        Private dispatcher As IEventDispatcher
        Private conf As Configuration

        Public Sub New(ByRef dispatcher As IEventDispatcher, ByRef conf As Configuration)
        End Sub

        Public Sub DetachHandlers()
        End Sub
    End Class
End Namespace
This is now an empty plug-in: it compiles and the search engine can attach to it, but it does not do anything yet.
To do something useful, the class needs to attach event handlers in the constructor and detach those handlers when DetachHandlers is called.
// C#
public ExternalEventHandler(IEventDispatcher dispatcher, Configuration conf)
{
    Keyoti.SearchEngine.DataAccess.Log.WriteLogEntry("SiteMapper", "Initialized", conf);
    dispatcher.Action += new ActionEventHandler(dispatcher_Action);
    dispatcher.NeedObject += new NeedObjectEventHandler(dispatcher_NeedObject);
    this.dispatcher = dispatcher;
    this.conf = conf;
    ...
}

public void DetachHandlers()
{
    if (dispatcher != null)
    {
        dispatcher.Action -= new ActionEventHandler(dispatcher_Action);
        dispatcher.NeedObject -= new NeedObjectEventHandler(dispatcher_NeedObject);
    }
    ...
}
' VB.NET
Public Sub New(ByRef dispatcher As IEventDispatcher, ByRef conf As Configuration)
    MyBase.New()
    Log.WriteLogEntry("SiteMapper", "Initialized", conf)
    AddHandler dispatcher.Action, AddressOf Me.dispatcher_Action
    AddHandler dispatcher.NeedObject, AddressOf Me.dispatcher_NeedObject
    Me.dispatcher = dispatcher
    Me.conf = conf
    ...
End Sub

Public Sub DetachHandlers()
    If (Not (dispatcher) Is Nothing) Then
        RemoveHandler dispatcher.Action, AddressOf Me.dispatcher_Action
        RemoveHandler dispatcher.NeedObject, AddressOf Me.dispatcher_NeedObject
    End If
    ...
End Sub
The code also writes to a custom log file named "SiteMapper.txt" (if Logging is enabled in Configuration), and keeps references to the configuration (conf) and the event dispatcher (dispatcher).
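The attach/detach pairing shown above relies on general .NET behaviour: two delegate instances that wrap the same method on the same object compare equal, so `-=` with a newly constructed delegate removes the handler that `+=` added. The following is a minimal, self-contained sketch of that pattern only. DemoDispatcher and DemoPlugin are hypothetical stand-ins invented for illustration; they are not part of the Keyoti API, and the real dispatcher is supplied by the search engine.

```csharp
using System;

// Hypothetical stand-in for the event dispatcher (NOT the Keyoti type).
public class DemoDispatcher
{
    public event EventHandler Action;

    // Raises the event if anything is attached.
    public void Raise()
    {
        EventHandler handler = Action;
        if (handler != null) handler(this, EventArgs.Empty);
    }
}

// Hypothetical plug-in that attaches in the constructor and detaches on request.
public class DemoPlugin
{
    readonly DemoDispatcher dispatcher;
    public int Calls;

    public DemoPlugin(DemoDispatcher dispatcher)
    {
        this.dispatcher = dispatcher;
        dispatcher.Action += new EventHandler(OnAction);
    }

    public void DetachHandlers()
    {
        // A new delegate over the same instance method equals the one added
        // in the constructor, so this removes it.
        if (dispatcher != null)
            dispatcher.Action -= new EventHandler(OnAction);
    }

    void OnAction(object sender, EventArgs e)
    {
        Calls++;
    }
}

public static class DetachDemo
{
    public static void Main()
    {
        DemoDispatcher d = new DemoDispatcher();
        DemoPlugin p = new DemoPlugin(d);
        d.Raise();            // handler attached: counted
        p.DetachHandlers();
        d.Raise();            // handler detached: not counted
        Console.WriteLine(p.Calls); // prints 1
    }
}
```

If the detach step were skipped, the dispatcher would keep a reference to the plug-in object and go on calling it, which is presumably why the plug-in is asked to detach its handlers.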
Next, the event handler methods are added, and the constructor and DetachHandlers method are completed with calls to create and close a StreamWriter.
// C#
using System;
using System.Collections;
using System.Text;
using System.IO;
using Keyoti.SearchEngine.Events;
using Keyoti.SearchEngine.Documents;
using Keyoti.SearchEngine.DataAccess;
using Keyoti.SearchEngine;

namespace Keyoti.SearchEngine
{
    /// <summary>
    /// Creates a site-map when a web-site is crawled.
    /// </summary>
    public class ExternalEventHandler
    {
        StreamWriter sw;
        IEventDispatcher dispatcher;
        Configuration conf;

        public ExternalEventHandler(IEventDispatcher dispatcher, Configuration conf)
        {
            Keyoti.SearchEngine.DataAccess.Log.WriteLogEntry("SiteMapper", "Initialized", conf);
            dispatcher.Action += new ActionEventHandler(dispatcher_Action);
            dispatcher.NeedObject += new NeedObjectEventHandler(dispatcher_NeedObject);
            this.dispatcher = dispatcher;
            this.conf = conf;
            // Create (or overwrite) sitemap.txt in the index directory.
            sw = new StreamWriter(Path.Combine(conf.IndexDirectory, "sitemap.txt"), false);
        }

        public void DetachHandlers()
        {
            if (dispatcher != null)
            {
                dispatcher.Action -= new ActionEventHandler(dispatcher_Action);
                dispatcher.NeedObject -= new NeedObjectEventHandler(dispatcher_NeedObject);
            }
            sw.Close();
        }

        public void dispatcher_Action(object sender, ActionEventArgs e)
        {
            // Log every action received (written to CustomAssembly.txt if Logging is enabled).
            Keyoti.SearchEngine.DataAccess.Log.WriteLogEntry("CustomAssembly", e.ActionData.Name.ToString(), conf);
            if (e.ActionData.Name == ActionName.DocumentBeingCrawled)
            {
                // Data holds the document at index 0 and its links at index 1.
                Document document = (e.ActionData.Data as object[])[0] as Document;
                ArrayList links = (e.ActionData.Data as object[])[1] as ArrayList;
                sw.WriteLine("##################################################################");
                sw.WriteLine(document.Uri.AbsoluteUri);
                sw.WriteLine("------------------------------------------------------------------");
                foreach (Uri link in links)
                    sw.WriteLine(link.AbsoluteUri);
                sw.Flush();
            }
        }

        // Required by the subscription above; the site-mapper does not use this event.
        public void dispatcher_NeedObject(object sender, NeedObjectEventArgs e)
        {
        }
    }
}
' VB.NET
Imports Keyoti.SearchEngine.Events
Imports Keyoti.SearchEngine.Documents
Imports Keyoti.SearchEngine.DataAccess
Imports Keyoti.SearchEngine
Imports System.Collections
Imports System.IO

Namespace Keyoti.SearchEngine
    ''' <summary>
    ''' Creates a site-map when a web-site is crawled.
    ''' </summary>
    Public Class ExternalEventHandler
        Private sw As StreamWriter
        Private dispatcher As IEventDispatcher
        Private conf As Configuration

        Public Sub New(ByRef dispatcher As IEventDispatcher, ByRef conf As Configuration)
            MyBase.New()
            Log.WriteLogEntry("SiteMapper", "Initialized", conf)
            AddHandler dispatcher.Action, AddressOf Me.dispatcher_Action
            AddHandler dispatcher.NeedObject, AddressOf Me.dispatcher_NeedObject
            Me.dispatcher = dispatcher
            Me.conf = conf
            ' Create (or overwrite) sitemap.txt in the index directory.
            sw = New StreamWriter(Path.Combine(conf.IndexDirectory, "sitemap.txt"), False)
        End Sub

        Public Sub DetachHandlers()
            If (Not (dispatcher) Is Nothing) Then
                RemoveHandler dispatcher.Action, AddressOf Me.dispatcher_Action
                RemoveHandler dispatcher.NeedObject, AddressOf Me.dispatcher_NeedObject
            End If
            sw.Close()
        End Sub

        Public Sub dispatcher_Action(ByVal sender As Object, ByVal e As ActionEventArgs)
            ' Log every action received (written to CustomAssembly.txt if Logging is enabled).
            Log.WriteLogEntry("CustomAssembly", e.ActionData.Name.ToString(), conf)
            If (e.ActionData.Name = ActionName.DocumentBeingCrawled) Then
                ' Data holds the document at index 0 and its links at index 1.
                Dim document As Document = CType(CType(e.ActionData.Data, Object())(0), Document)
                Dim links As ArrayList = CType(CType(e.ActionData.Data, Object())(1), ArrayList)
                sw.WriteLine("##################################################################")
                sw.WriteLine(document.Uri.AbsoluteUri)
                sw.WriteLine("------------------------------------------------------------------")
                For Each link As Uri In links
                    sw.WriteLine(link.AbsoluteUri)
                Next
                sw.Flush()
            End If
        End Sub

        ' Required by the subscription above; the site-mapper does not use this event.
        Public Sub dispatcher_NeedObject(ByVal sender As Object, ByVal e As NeedObjectEventArgs)
        End Sub
    End Class
End Namespace
The site-mapper is only interested in the Action event, and in particular in actions named DocumentBeingCrawled (for interest, the example also writes every action it receives to the log file CustomAssembly.txt, if Logging is enabled). When a DocumentBeingCrawled action occurs, the ActionData.Data object is used to access the document in question and its links (information about action names and their associated data is in the API documentation under 'Namespaces'). The document Uri and the list of links are then written to the text file by a StreamWriter.
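To make the shape of the action data concrete, here is a self-contained sketch that can be run without the search engine. It assumes only what the demo code shows: the data is an object array whose first element identifies the page and whose second element is an ArrayList of Uri links. SiteMapSketch and WriteEntry are hypothetical names, and a plain Uri stands in for the Document (the real handler reads document.Uri). Unlike the demo handler, it checks the casts before using them; that is a defensive choice for illustration, not something the API is known to require.

```csharp
using System;
using System.Collections;
using System.IO;

// Hypothetical helper; mimics the site-map entry format with stand-in data.
public static class SiteMapSketch
{
    // Writes one entry: a separator, the page address, a second separator,
    // then one line per link. Does nothing if the data is not shaped as expected.
    public static void WriteEntry(TextWriter writer, object data)
    {
        object[] parts = data as object[];
        if (parts == null || parts.Length < 2) return;

        Uri page = parts[0] as Uri;              // stand-in for the Document's Uri
        ArrayList links = parts[1] as ArrayList; // list of Uri links
        if (page == null || links == null) return;

        writer.WriteLine(new string('#', 66));
        writer.WriteLine(page.AbsoluteUri);
        writer.WriteLine(new string('-', 66));
        foreach (Uri link in links)
            writer.WriteLine(link.AbsoluteUri);
        writer.Flush();
    }

    public static void Main()
    {
        ArrayList links = new ArrayList();
        links.Add(new Uri("http://example.com/about.html"));
        links.Add(new Uri("http://example.com/contact.html"));

        // Same layout as the action data: element 0 is the page, element 1 its links.
        object[] data = new object[] { new Uri("http://example.com/"), links };
        WriteEntry(Console.Out, data);
    }
}
```

Running it prints one entry in the same layout the plug-in writes to sitemap.txt: the page address between two separator lines, followed by its links.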
Now that the project is complete, it can be compiled and used. For development and testing, the easiest way to use the plug-in is to create an Index Directory with a configuration setting that points to the DLL. In the demo project this is already done: there is a folder named IndexDirectory under the project, which contains only a configuration.xml file. The Configuration.EventHandlerAssemblyPath property is set to the path of the DLL, relative to the index directory (e.g. ..\bin\Plug-in_SiteMap_vb.dll). Since any search process working on this Index Directory will use the plug-in DLL, it can be tested very simply:
Due to the nature of DLL loading, there are some handy tips: